Abstract: This paper studies the question of how well a signal 
can be represented by a sparse linear combination of reference 
signals from an overcomplete dictionary. When the dictionary 
size is exponential in the dimension of the signal, the exact 
characterization of the optimal distortion is given as a function of 
the dictionary size exponent and the number of reference signals 
for the linear representation. Roughly speaking, every signal is 
sparse if the dictionary size is exponentially large, no matter how 
small the exponent is. Furthermore, an iterative method similar 
to matching pursuit that successively finds the best reference 
signal at each stage gives asymptotically optimal representations. 
This method is essentially equivalent to successive refinement for 
multiple descriptions and provides a simple alternative proof of 
the successive refinability of white Gaussian sources. 

I. Introduction and Main Results 

Suppose one wishes to represent a signal as a linear 
combination of reference signals. If the collection C of reference 
signals (called a dictionary) is rich (i.e., the size M = |C| of the 
dictionary is much larger than the dimension n of the signal) 
or if one is allowed to take an arbitrarily complex linear 
combination (i.e., the number k of reference signals forming 
the linear combination is very large), then one can expect that 
the linear representation approximates the original signal with 
very little distortion. As a trivial example, if C contains n 
linearly independent reference signals of dimension n, then 
every signal can be represented faithfully as a linear combina- 
tion of those n reference signals. On the other extreme point, 
if C includes all possible signals, then the original signal can 
be represented as (a linear combination of) itself without any 
distortion. More generally, Shannon's rate distortion theory [1] 
suggests that if the dictionary size $M = 2^{nR}$ is exponential 
in n with exponent R > 0, then the best reference signal (as 
a singleton) achieves the distortion D(R) given as a function 
of R. 

Several interesting questions arise: 

1) What will happen if the linear combination is sparse 
($k \ll n$)? How well can one represent a signal as a 
(sparse) linear combination of reference signals? 

2) How should one choose the dictionary of reference 
signals under the size limitation? Is there a dictionary 
that provides a good representation for all or most 
signals? 

3) How can one find the best linear representation given 
the dictionary? Is there a low-complexity algorithm with 
optimal or near-optimal performance? 

These questions arise in many applications and naturally 
have been studied in several different contexts [2]. The current 
paper provides partial answers to these questions by focusing 
on the asymptotic relationship between the collection size M, the 
dimension n of the signal, the sparsity k of the representation, 
and the distortion D of the representation. 

More formally, let $C = \{\phi(1), \phi(2), \ldots, \phi(M)\}$ be a 
collection (dictionary) of M vectors in $\mathbb{R}^n$. For each vector 
$y \in \mathbb{R}^n$, we define its best k-linear representation $\hat{y}_k$ from the 
dictionary C as 

$$\hat{y}_k = x_1 \phi(m_1) + x_2 \phi(m_2) + \cdots + x_k \phi(m_k),$$

where $x_1, \ldots, x_k \in \mathbb{R}$ and $m_1, \ldots, m_k \in [1 : M] := \{1, 2, \ldots, M\}$ 
are chosen to minimize the squared error 

$$d_k(y, C) = \| y - (x_1 \phi(m_1) + x_2 \phi(m_2) + \cdots + x_k \phi(m_k)) \|^2.$$

Here the norm of a vector $z = (z_1, \ldots, z_n) \in \mathbb{R}^n$ is defined 
as $\|z\| = \left(\sum_{i=1}^{n} z_i^2\right)^{1/2}$. 
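For k = 1 the minimization has a closed form: for a fixed reference signal $\phi(m)$ the best coefficient is $x = \langle y, \phi(m) \rangle / \|\phi(m)\|^2$, which leaves the error $\|y\|^2 - \langle y, \phi(m) \rangle^2 / \|\phi(m)\|^2$. The following Python sketch evaluates $d_1(y, C)$ by exhaustive search over the dictionary. The function names and the 0-based indexing of dictionary entries are illustrative assumptions, not notation from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Euclidean inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def best_singleton(y: Vector, dictionary: List[Vector]) -> Tuple[int, float, float]:
    """Return (m, x, error) minimizing ||y - x * phi(m)||^2 over m and real x.

    The index m is 0-based; m = -1 means that no reference signal reduces
    the error below ||y||^2 (for example, when the dictionary is {0}).
    """
    best = (-1, 0.0, dot(y, y))
    for m, phi in enumerate(dictionary):
        energy = dot(phi, phi)
        if energy == 0.0:
            continue  # the zero vector cannot reduce the error
        x = dot(y, phi) / energy           # least-squares coefficient
        err = dot(y, y) - x * x * energy   # ||y||^2 - <y, phi>^2 / ||phi||^2
        if err < best[2]:
            best = (m, x, err)
    return best
```

For instance, with y = (1, 0) and the dictionary {(0, 1), (1, 1)}, the search selects the second vector with coefficient 1/2 and squared error 1/2.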

We further define the worst-case distortion $d_k^*(C)$ of the 
dictionary C as 

$$d_k^*(C) := \sup_{y :\, \|y\| \le 1} d_k(y, C),$$

where the supremum is taken over all n-vectors y in the 
(closed) unit sphere. 

Note that $d_k^*(C) \le 1$ for all C and all n, with $d_k^*(C) = 1$ 
attained by a singleton dictionary C = {0}. Conversely, if 
M < n, then $d_k^*(C) = 1$ for any dictionary C of size M. 
Hence, we consider the case $M \ge n$ only, that is, the case in 
which the dictionary is overcomplete. 

Similarly, we define the average-case distortion $\bar{d}_k(C)$ of the 
dictionary C as 

$$\bar{d}_k(C) = \mathrm{E}(d_k(Y, C)),$$

where the expectation is taken with respect to a random signal 
Y uniformly drawn from the unit sphere $\{y \in \mathbb{R}^n : \|y\| \le 1\}$. 

Now we are ready to state our main results. The first result 
concerns the existence of an asymptotically good dictionary. 

Theorem 1: Suppose $M = M_n$ satisfies 

$$\liminf_{n \to \infty} \frac{\log M_n}{n} > 0.$$

Then there exists a sequence of dictionaries $C_n$ of respective 
sizes $M_n$ such that 

$$\limsup_{n \to \infty} \left( \log d_k^*(C_n) + \frac{2k \log M}{n} \right) \le 0. \tag{1}$$

In particular, if $k \to \infty$, then $d_k^*(C_n) \to 0$. 



An interesting implication of Theorem 1 is that if we 
choose a good dictionary of exponentially large size, no matter 
how small the exponent is, every signal is essentially sparse 
(say, k = log log n) with respect to that dictionary in the 
asymptotic regime. 
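As a rough numerical illustration of this implication, the sketch below takes the idealized asymptotic distortion $2^{-2kR}$ for a dictionary of size about $2^{nR}$ and computes the smallest number of terms k that brings it below a target value. The function names and sample values are assumptions made for illustration, and finite-n effects are ignored entirely.

```python
import math


def asymptotic_distortion(k: int, R: float) -> float:
    """Idealized worst-case distortion 2^{-2kR} after k terms."""
    return 2.0 ** (-2.0 * k * R)


def terms_for_target(R: float, target: float) -> int:
    """Smallest k >= 1 with 2^{-2kR} <= target, for R > 0 and 0 < target < 1."""
    k = max(1, math.ceil(math.log2(1.0 / target) / (2.0 * R)))
    # Guard against floating-point rounding at the boundary.
    while asymptotic_distortion(k, R) > target:
        k += 1
    while k > 1 and asymptotic_distortion(k - 1, R) <= target:
        k -= 1
    return k


if __name__ == "__main__":
    for R in (1.0, 0.1, 0.01):
        print(R, terms_for_target(R, 0.01))
```

Even for a small exponent R the required k depends only on R and the target distortion, not on the dimension n, which is the sense in which every signal is essentially sparse.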

The proof of Theorem 1 will be given in Section II. 
The major ingredients of the proof include Wyner's uniform 
sphere covering lemma [3] and its application in successive 
linear representation. Simply put, given a good dictionary for 
singleton representations (k = 1), we iteratively represent the 
signal, the error, the error of the error, etc. by scaling the same 
dictionary. 

This representation method is intimately related to succes- 
sive refinement coding [4]. Indeed, Theorem 1, specialized to 
k = 1, is essentially equivalent to Shannon's rate distortion 
theorem for white Gaussian sources [1]. At the same time, the 
representation method gives a very simple proof of successive 
refinability [4] and additive successive refinability [5] of white 
Gaussian sources under the mean squared error distortion. 

It turns out that the asymptotic distortion in Theorem 1, 
which is achieved by the simple successive representation 
method, is in fact optimal. The following result, essentially 
due to Fletcher et al. [6], provides the performance bound for 
the optimal dictionary. 

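Theorem 2 ([6, Theorem 2]): For any sequence of dictionaries 
$C_n$ of size $M = M_n$ and any nondecreasing sequence $k = k_n$, 

$$\liminf_{n \to \infty} \left( \log \bar{d}_k(C_n) + \frac{2 \log \binom{M}{k}}{n - k} + c_n \right) \ge 0,$$

where 

$$c_n = \log \frac{n}{n - k} + \frac{k}{n - k} \log \frac{n}{k}.$$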



In particular, if k is bounded, then for any sequence of 
dictionaries $C_n$ of size $M = M_n$, 

$$\liminf_{n \to \infty} \left( \log \bar{d}_k(C_n) + \frac{2k \log M}{n} \right) \ge 0.$$



Note that if $M = 2^{nR}$ for some R > 0 and k is a constant, 
then Theorem 2 implies that the average distortion is lower 
bounded by 

$$\liminf_{n \to \infty} \left( \log \bar{d}_k(C_n) + \frac{2k \log M}{n} \right) \ge 0.$$

(Therefore so is the worst-case distortion.) Thus the distortion 
bound (1) in Theorem 1 is tight when the dictionary size grows 
exponentially in n. 

The asymptotic optimality of the successive representation 
method provides a theoretical justification for matching pursuit [7] 
or similar greedy algorithms in signal processing. 
This conclusion is especially appealing since these iterative 
methods have linear complexity in the dictionary size M (or 
even lower complexity if the dictionary has further structure), 
while finding the optimal representation in a single shot, even 
when tractable, can have much higher complexity. However, 
there are two caveats. First, the dictionary size here is 
exponential in the signal dimension. Second, the dictionary should 
represent all signals with singletons uniformly well. 

In a broad context, these results are intimately related to 
recovery of sparse signals via linear measurements. Indeed, 
the sparse linear representation can be expressed as 

$$y = \Phi x + z, \tag{2}$$

where $\Phi$ is an n x M matrix with columns in C, $x \in \mathbb{R}^M$ 
is a sparse vector with k nonzero elements $x_1, \ldots, x_k$, and 
z is the representation error. The award-winning papers by 
Candes and Tao [8] and Donoho [9] showed that a sparse 
signal x can be reconstructed exactly and efficiently from the 
measurement y given by the underdetermined system (2) of 
linear equations (when the measurement noise z = 0), opening 
up the exciting field of compressed sensing. There have been 
several follow-up discussions that connect compressed sensing 
and information theory; we refer the reader to [10], [11], [12], 
[13], [14], [15], [16] for various aspects of the connection 
between the two fields. 

While it is quite unfair to summarize in a single sentence a 
variety of problems studied by compressed sensing and more 
generally sparse signal recovery, the central focus therein is 
to recover the true sparse signal x from the measurement 
y. In particular, when the measurement process is corrupted 
by noise, the main goal becomes mapping a noise-corrupted 
measurement output y to its corresponding cause x in an 
efficient manner. 

The sparse signal representation problem is in a sense 
dual to the sparse signal recovery problem (just like source 
coding is dual to channel coding). Here the focus is on 
y and its representation (approximation). There is no true 
representation, and a good dictionary should have several 
alternative representations of similar distortions. As mentioned 
above, the problem of general, not necessarily linear, sparse 
representation (also called the sparse approximation problem) 
has a history longer than a century [17], [18] and has been 
studied in several different contexts. Along with the parallel 
development in compressed sensing, the recent focus has been 
efficient algorithms and their theoretical properties; see, for 
example, [19], [20], [21]. 

In comparison, the studies in [6], [22], and this paper focus 
on finding asymptotically optimal dictionaries, regardless of 
computational complexity,^1 and study the tradeoff among the 
sparsity of the representation, the size of the dictionary, and 
the fidelity of the approximation. For example, Fletcher et 
al. [6] found a lower bound on the approximation error 
using rate distortion theory for Gaussian sources with mean 
squared distortion. A similar lower bound is obtained by 
Akcakaya and Tarokh [22] based on a careful calculation of 
volumes of spherical caps. Thus, the main contribution of 
this paper is twofold. First, our Theorem 1 shows that these 
lower bounds (in particular the one in [6, Theorem 2]) are 
asymptotically tight when the dictionary size is exponential 
in the signal length. Second, we show that a simple successive 
representation method achieves the lower bound, revealing an 
intimate connection between sparse signal representation and 
multiple description coding. 

^1 Fortuitously, the associated representation method is highly efficient. 

The rest of the paper is organized as follows. We give the 
proof of Theorem 1 in Section II. In Section III, we digress 
a little to discuss the implication of Theorem 1 on succes- 
sive refinement for multiple descriptions of white Gaussian 
sources and its dual — successive cancelation for additive white 
Gaussian noise multiple access channels. Finally, the proof of 
Theorem 2 is presented in Section IV. 

II. Successive Linear Representation 

In this section, we prove that there exists a codebook of 
exponential size that is asymptotically good for all signals 
and all sparsity levels. More constructively, we demonstrate 
that a simple iterative representation method finds a good 
representation. 

More precisely, we show that if R' > R > 0, there exists 
a sequence of dictionaries $C_n$ with sizes $M = 2^{nR'}$ such that 
for n = n(R', R) sufficiently large, 

$$d_k^*(C_n) \le 2^{-2kR}$$

for every k (independent of n). Since the above inequality 
holds for all $R \in (0, R')$, we have 

$$\limsup_{n \to \infty} \left( \log d_k^*(C_n) + \frac{2k \log M}{n} \right) \le 0.$$



The following result by Wyner [3] (rephrased for our 
application) is crucial in proving the above claim: 

Lemma 1 (Uniform covering lemma): Given $D \in (0, 1)$, 
let $R' > R(D) = (1/2) \log(1/D)$. Then, for n = n(R', D) 
sufficiently large, there exists a dictionary 
$C_n = \{\hat{y}(m) : m \in [1 : 2^{nR'}]\}$ such that for all y in the sphere of radius r, 

$$\min_{m \in [1 : 2^{nR'}]} \| y - \hat{y}(m) \|^2 \le r^2 D.$$

In particular, for all $y \in \mathbb{R}^n$, 

$$\min_{x \in \mathbb{R}} \, \min_{m \in [1 : 2^{nR'}]} \| y - x \hat{y}(m) \|^2 \le \|y\|^2 D.$$

Note that Wyner's uniform covering lemma shows the 
existence of a dictionary sequence $C_n$ satisfying 

$$\limsup_{n \to \infty} d_1^*(C_n) \le D = 2^{-2R},$$

which is simply a restatement of the claim for k = 1. 

Equipped with the lemma, it is straightforward to prove the 
desired claim for k > 1. Given an arbitrary y in the unit 
sphere, let $\hat{y}(m_1)$ be the best singleton representation of y and 
$z_1 = y - \hat{y}(m_1)$ be the resulting error. Then we find the 
best singleton representation $\hat{z}_1 = x_2 \hat{y}(m_2)$ of $z_1$ from the 
dictionary, resulting in the error $z_2 = z_1 - \hat{z}_1$. In general, at 
the k-th iteration, the error $z_{k-1}$ from the previous stage is 
represented by $\hat{z}_{k-1} = x_k \hat{y}(m_k)$, resulting in the error $z_k$. 
Thus this process gives a k-linear representation of y as 

$$\begin{aligned}
y &= \hat{y}(m_1) + z_1 \\
  &= \hat{y}(m_1) + x_2 \hat{y}(m_2) + z_2 \\
  &\;\;\vdots \\
  &= \hat{y}(m_1) + x_2 \hat{y}(m_2) + \cdots + x_k \hat{y}(m_k) + z_k.
\end{aligned}$$

But by simple induction and the uniform covering lemma, we 
have 

$$\|z_k\|^2 \le D \|z_{k-1}\|^2 \le D^2 \|z_{k-2}\|^2 \le \cdots \le D^{k-1} \|z_1\|^2 \le D^k,$$

which completes the proof of the claim. Note that each of the k 
representations attains mean square error $2^{-2jR}$ for its sparsity 
level j = 1, . . . , k. 
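The iteration above (represent the signal, then the error, then the error of the error) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction used in the proof: the dictionary here is a small set of random unit vectors rather than a covering dictionary from Lemma 1, every stage (including the first) uses a freely optimized scale factor, and all function names are hypothetical. With such a dictionary the residual energy is guaranteed only to be non-increasing; the geometric decay $D^k$ requires the uniform covering property.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def random_unit_vector(n: int, rng: random.Random) -> List[float]:
    """Draw a random direction by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(v, v))
    return [vi / norm for vi in v]


def successive_representation(
    y: Vector, dictionary: List[Vector], k: int
) -> Tuple[List[Tuple[int, float]], List[float]]:
    """Greedy k-term representation of y.

    At each stage the current residual is fitted with its best scaled
    dictionary vector, which is then subtracted. Returns the chosen
    (index, coefficient) pairs and the final residual z_k.
    """
    residual = list(y)
    terms: List[Tuple[int, float]] = []
    for _ in range(k):
        best_m, best_x, best_gain = None, 0.0, 0.0
        for m, phi in enumerate(dictionary):
            energy = dot(phi, phi)
            if energy == 0.0:
                continue
            c = dot(residual, phi)
            gain = c * c / energy  # error reduction from using vector m
            if gain > best_gain:
                best_m, best_x, best_gain = m, c / energy, gain
        if best_m is None:
            break  # residual is orthogonal to every dictionary vector
        phi = dictionary[best_m]
        residual = [r - best_x * p for r, p in zip(residual, phi)]
        terms.append((best_m, best_x))
    return terms, residual


if __name__ == "__main__":
    rng = random.Random(0)
    n, M = 8, 64
    dictionary = [random_unit_vector(n, rng) for _ in range(M)]
    y = random_unit_vector(n, rng)
    for k in range(1, 6):
        _, z = successive_representation(y, dictionary, k)
        print(k, dot(z, z))
```

Each stage costs one pass over the dictionary, so k stages take time proportional to kMn, in contrast to an exhaustive search over all k-subsets.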

III. Successive Refinement for Gaussian Sources 

The proof in the previous section leads to a deceptively 
simple proof of successive refinability of white Gaussian 
sources [23]. First note that in the successive linear 
representation method we can take $x_k = D^{(k-1)/2}$ for each k. 
Moreover, if $U = (U_1, \ldots, U_n)$ is drawn independently and 
identically according to the standard normal distribution, then 
it can be shown that 

$$\mathrm{E}\left( \frac{1}{n} \|U\|^2 \, \mathbf{1}\left\{ \frac{1}{\sqrt{n}} \|U\| > 1 + \epsilon \right\} \right) \to 0$$

as $n \to \infty$ for any $\epsilon > 0$. Hence, a good representation of 
a random vector U when the vector is inside the sphere of 
radius $(1 + \epsilon)\sqrt{n}$ is sufficient for a good description of U in 
general. 

Now our successive representation method achieves the 
(expected) mean square distortion $(1 + \epsilon) D^k$ after k iterations 
with a dictionary of size $2^{nR'}$, where $R' > R(D) = (1/2) \log(1/D)$, 
which is nothing but the Gaussian rate distortion 
function. Hence, by describing the index of the singleton 
representation at each iteration using nR' bits, we can achieve 
distortion levels $D, D^2, \ldots, D^k$ and trace the Gaussian rate 
distortion function for $R', 2R', \ldots, kR'$. (Recall that we don't 
need to describe the scaling factors $x_k = D^{(k-1)/2}$, since 
these are constants independent of n.) 
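The arithmetic behind this statement is easy to check numerically. The sketch below works in the idealized limit R' = R(D) with $\epsilon = 0$ (an assumption; the text requires R' > R(D)) and uses hypothetical function names. It lists the cumulative rate and distortion after each stage and confirms that every pair lies on the curve $R(D) = (1/2) \log_2(1/D)$.

```python
import math
from typing import List, Tuple


def gaussian_rate(D: float) -> float:
    """R(D) = (1/2) log2(1/D) bits per sample for a unit-variance
    white Gaussian source, valid for 0 < D <= 1."""
    return 0.5 * math.log2(1.0 / D)


def refinement_schedule(D: float, k: int) -> List[Tuple[float, float]]:
    """(cumulative rate, distortion) after each of k stages, where every
    stage spends rate R(D) and shrinks the remaining error by the factor D."""
    R = gaussian_rate(D)
    return [(j * R, D ** j) for j in range(1, k + 1)]


if __name__ == "__main__":
    for rate, dist in refinement_schedule(0.25, 4):
        # Each stage stays on the rate distortion curve.
        print(rate, dist, gaussian_rate(dist))
```

The check works because $R(D^j) = (j/2) \log_2(1/D) = j R(D)$, so spending the same incremental rate at every stage never leaves the curve.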

More generally, the same argument easily extends to the 
case in which incremental rates $R_1, R_2, \ldots, R_k$ are not 
necessarily identical; one can even prove the existence of nested 
codebooks (up to scaling) that uniformly cover the unit sphere. 

Operationally, the recursive coding scheme for successive 
refinement (i.e., describing the error, the error of the error, 
and so on) can be viewed as a dual procedure to successive 
cancelation [24], [25] for the Gaussian multiple access 
channels, in which the message for each user is peeled 
off iteratively. In both cases, one strives to best solve the 
single-user source [channel] coding problem at each stage 
and progresses recursively by subtracting off the encoded 
[decoded] part of the source [channel output] y. This duality 
can be complemented by an interesting connection between the 
orthogonal matching pursuit and the successive cancelation [26] 
and the duality between signal recovery and signal 
representation. Note, however, that the duality here is mostly conceptual 
and cannot be made more precise. For example, while we can 
use a single codebook (dictionary) for each of k successive 
descriptions (again up to scaling) as shown above, one cannot 
use the same codebook for all k users in the Gaussian multiple 
access channel. If the channel gains are identical among users, 
it is impossible to distinguish who sent which message (from 
the same codebook), even without any additive noise! There is 
no uniform packing lemma that matches the Gaussian capacity 
function, to begin with. 

IV. Lower Bound on the Distortion 

We show that for any sequence of dictionaries $C_n$ of size 
$M = M_n$ and any nondecreasing sequence $k = k_n$, 

$$\log \bar{d}_k(C_n) + \frac{2 \log \binom{M}{k}}{n - k} + \log \frac{n}{n - k} + \frac{k}{n - k} \log \frac{n}{k} \ge o(1).$$



While a similar proof is given in [6, Theorem 2], we present 
our version for completeness, which slightly generalizes the 
proof in [6]. 

The basic idea of the proof is to bound the mean 
square error between the random vector Y and its representation 
vector $\hat{Y}$ by computing the mean square error between Y 
and Y' (a quantized version of $\hat{Y}$) and the quantization error 
(the mean square error between $\hat{Y}$ and Y'). Then, the tradeoff 
between the error and the complexity of the representation is 
analyzed via rate distortion theory. Details are as follows. 

Without loss of generality, assume that 

$$\liminf_{n \to \infty} \bar{d}_k(C_n) \le D < 1.$$

Let $\hat{y} = \hat{y}(y) = \sum_{i=1}^{k} x_i \phi(m_i)$ be the best k-sparse linear 
representation of a given vector y in the unit sphere. Then $\hat{y}$ 
can be rewritten as 

$$\hat{y} = \sum_{i=1}^{k} \lambda_i(y) \psi_i(y), \tag{3}$$

where $\psi_1, \ldots, \psi_k$ form an orthonormal basis of the subspace 
spanned by $\phi(m_1), \ldots, \phi(m_k)$, uniquely obtained from the 
Gram-Schmidt orthogonalization. Since $\|\hat{y}\|^2 = \sum_{i=1}^{k} \lambda_i^2 \le 1$ 
from the orthogonality of the vectors $\psi_1, \ldots, \psi_k$, $\lambda_i \in [-1, 1]$ 
for all i. 

We consider two cases: 

(a) Bounded k: Suppose the sequence $k = k_n$ is bounded. 
Since $c_n \to 0$ as $n \to \infty$ in Theorem 2 for any bounded 
sequence k, it suffices to show that the following 
inequality holds for any sequence $C_n$ of dictionaries for 
a bounded sequence k: 

$$\frac{2}{n} \log \binom{M}{k} + \log \bar{d}_k(C_n) \ge o(1).$$

Next, we approximate $\hat{y}$ by quantizing $\lambda_1, \ldots, \lambda_k$ into 

$$\lambda'_1, \ldots, \lambda'_k \in \left\{ -1, -\frac{l_n - 1}{l_n}, \ldots, -\frac{1}{l_n}, 0, \frac{1}{l_n}, \frac{2}{l_n}, \ldots, \frac{l_n - 1}{l_n}, 1 \right\}$$

with quantization step size $1/l_n$. Let 

$$y'(y) = \sum_{i=1}^{k} \lambda'_i(y) \psi_i(y). \tag{4}$$



Then, $\|\hat{y} - y'\|^2 \le k (1/l_n)^2 = k / l_n^2$. Since $\hat{y}$ is obtained 
by orthogonal projection of y to the subspace spanned 
by $\psi_1, \ldots, \psi_k$ and y' is a vector in the subspace, $y - \hat{y}$ 
and $\hat{y} - y'$ are orthogonal. Thus, we have 

$$\|y - y'\|^2 = \|y - \hat{y}\|^2 + \|\hat{y} - y'\|^2 \le \|y - \hat{y}\|^2 + k / l_n^2 =: d_k(y, C_n) + \epsilon_n.$$
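The projection, quantization, and orthogonality steps can be checked numerically. The sketch below is an illustration under assumptions: the reference signals are passed in directly rather than selected from a dictionary, coefficients are rounded to the nearest multiple of 1/l (which gives a per-coefficient error of at most 1/(2l), within the step size 1/l used in the text), and all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def gram_schmidt(vectors: List[Vector]) -> List[List[float]]:
    """Orthonormal basis of span(vectors); dependent vectors are dropped."""
    basis: List[List[float]] = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis


def project_and_quantize(
    y: Vector, atoms: List[Vector], levels: int
) -> Tuple[List[float], List[float]]:
    """Return (y_hat, y_prime): the orthogonal projection of y onto
    span(atoms), and the same vector with its coefficients in the
    orthonormal basis rounded to multiples of 1/levels."""
    basis = gram_schmidt(atoms)
    coeffs = [dot(y, q) for q in basis]
    quantized = [round(c * levels) / levels for c in coeffs]
    n = len(y)
    y_hat = [sum(c * q[i] for c, q in zip(coeffs, basis)) for i in range(n)]
    y_prime = [sum(c * q[i] for c, q in zip(quantized, basis)) for i in range(n)]
    return y_hat, y_prime


if __name__ == "__main__":
    y = [0.3, -0.5, 0.6, 0.2]
    atoms = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
    y_hat, y_prime = project_and_quantize(y, atoms, levels=10)
    # Pythagorean identity: the two printed numbers agree up to rounding.
    print(sq_dist(y, y_prime), sq_dist(y, y_hat) + sq_dist(y_hat, y_prime))
```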

Now consider a random signal Y drawn uniformly from 
the unit sphere and its quantized representation 

$$Y' = \sum_{i=1}^{k} \lambda'_i(Y) \psi_i(Y). \tag{5}$$

Then, we have $\mathrm{E}(\|Y - Y'\|^2) \le \bar{d}_k(C_n) + \epsilon_n$. 
We have the following chain of inequalities: 

$$\begin{aligned}
\log \binom{M}{k} + k \log(2 l_n + 1)
  &\ge H(m_1(Y), \ldots, m_k(Y), \lambda'_1(Y), \ldots, \lambda'_k(Y)) \\
  &\ge H(Y') \\
  &\ge R(\bar{d}_k(C_n) + \epsilon_n),
\end{aligned} \tag{6}$$

where 

$$R(\bar{d}_k(C_n) + \epsilon_n) = \min_{p(y'|y) :\, \mathrm{E}[\|Y - Y'\|^2] \le \bar{d}_k(C_n) + \epsilon_n} I(Y; Y')$$

is the rate distortion function for Y under the mean 
square distortion $\bar{d}_k(C_n) + \epsilon_n$. 

Here are the justifications for the above steps. The first 
inequality follows from counting the possible k-dimensional 
subspaces (at most $\binom{M}{k}$) and the possible values of each 
$\lambda'_i$ (namely $2 l_n + 1$). The second inequality 
follows from the fact that Y' is a function of 
$(m_1(Y), \ldots, m_k(Y), \lambda'_1(Y), \ldots, \lambda'_k(Y))$. The last 
inequality follows from the rate distortion theorem. 
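By the Shannon lower bound on the rate distortion function 
and the (Euclidean) volume of the unit sphere, 

$$\begin{aligned}
R(\bar{d}_k(C_n) + \epsilon_n)
  &\ge h(Y) - \frac{n}{2} \log\left( \frac{2 \pi e (\bar{d}_k(C_n) + \epsilon_n)}{n} \right) \\
  &\ge \frac{n}{2} \log\left( \frac{1}{\bar{d}_k(C_n) + \epsilon_n} \right) - \frac{1}{2} \log(\pi n) - \frac{1}{6n}.
\end{aligned}$$

Combined together with (6), this yields 

$$\begin{aligned}
\frac{1}{n} \left( \log \binom{M}{k} + k \log(2 l_n + 1) \right)
  &\ge \frac{1}{n} R(\bar{d}_k(C_n) + \epsilon_n) && (7) \\
  &\ge \frac{1}{2} \log\left( \frac{1}{\bar{d}_k(C_n) + \epsilon_n} \right) - \frac{\log(\pi n)}{2n} - \frac{1}{6 n^2} && (8) \\
  &= \frac{1}{2} \log\left( \frac{1}{\bar{d}_k(C_n)} \right) + \frac{1}{2} \log\left( \frac{1}{1 + \epsilon_n / \bar{d}_k(C_n)} \right) - \frac{\log(\pi n)}{2n} - \frac{1}{6 n^2}. && (9)
\end{aligned}$$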



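Now, let $f_n$ be an increasing sequence satisfying 

$$\lim_{n \to \infty} f_n = \infty \quad \text{and} \quad \lim_{n \to \infty} \frac{\log f_n}{n} = 0, \tag{10}$$

and take $l_n = f_n / (\bar{d}_k(C_n))^{1/2}$. By plugging $l_n$ into (8), we 
have 

$$\frac{1}{n} \left( \log \binom{M}{k} + k \log\left( \frac{2 f_n}{\sqrt{\bar{d}_k(C_n)}} + 1 \right) \right) \ge \frac{1}{2} \log\left( \frac{1}{\bar{d}_k(C_n)} \right) + \frac{1}{2} \log\left( \frac{1}{1 + \epsilon_n / \bar{d}_k(C_n)} \right) - o(1).$$

Arranging the terms in the above inequality yields 

$$\frac{1}{n} \left( \log \binom{M}{k} + k \log\left( 2 f_n + \sqrt{\bar{d}_k(C_n)} \right) \right) + o(1) \ge \frac{n - k}{2n} \log\left( \frac{1}{\bar{d}_k(C_n)} \right) + \frac{1}{2} \log\left( \frac{1}{1 + \epsilon_n / \bar{d}_k(C_n)} \right).$$

Then, we can note that $\epsilon_n / \bar{d}_k(C_n) = (k / l_n^2) / \bar{d}_k(C_n) = k / f_n^2$ 
and $\epsilon_n / \bar{d}_k(C_n) \to 0$ as $n \to \infty$. Also, from (10), 
$(k \log(2 f_n + \sqrt{\bar{d}_k(C_n)})) / n \le (k \log(2 f_n + 1)) / n \to 0$ 
as $n \to \infty$. Hence, taking the limit $n \to \infty$ in the last 
inequality, we get 

$$\liminf_{n \to \infty} \left( \log \bar{d}_k(C_n) + \frac{2k \log M}{n} \right) \ge 0.$$

Finally, it is easy to show that the inequality in Theorem 
2 reduces to the above inequality for the case when k 
is bounded. 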

(b) Unbounded k: In this case, the scalar quantization in 
part (a) gives a loose bound. Wyner's uniform covering 
lemma, however, can be applied to provide a sharper 
tradeoff between the description complexity and the 
quantization error. 

We continue the proof from the orthogonal representation 
of $\hat{y}$ in (3). Since $\hat{y}$ is a vector with length $\le 1$ in 
the k-dimensional subspace spanned by $\psi_1, \ldots, \psi_k$ and 
$k_n$ is an increasing sequence, we can invoke the uniform 
covering lemma. Therefore, there must exist a dictionary 
$C'_k$ of size $2^b$ and $y' \in C'_k$ satisfying 

$$\|\hat{y} - y'\|^2 \le 2^{-2b/k}.$$

Following the same arguments as in (5)-(9), we have 

$$\frac{1}{n} \left( \log \binom{M}{k} + b \right) \ge \frac{1}{2} \log\left( \frac{1}{\bar{d}_k(C_n) + 2^{-2b/k}} \right) - o(1).$$

Finally, optimizing over b yields 

$$\bar{d}_k(C_n) \ge 2^{-2 \log \binom{M}{k} / (n - k)} \cdot \left( \frac{n - k}{n} \right) \cdot \left( \frac{k}{n} \right)^{k / (n - k)}.$$

Taking the logarithm and letting $n \to \infty$ on both sides, 
we have the desired inequality. 
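The optimization over b can be made explicit. Ignoring the o(1) term and writing $J = \binom{M}{k}$, the inequality before the optimization rearranges to $\bar{d}_k(C_n) \ge 2^{-2(\log J + b)/n} - 2^{-2b/k}$, and setting the derivative in b to zero gives $2^{-2b} = (J^{-2/n} k/n)^{nk/(n-k)}$. The Python sketch below (hypothetical function names, arbitrary sample values, logarithms to base 2) checks numerically that this stationary point reproduces the closed-form product and that no other b on a grid does better.

```python
import math


def bound_given_b(log2_J: float, n: int, k: int, b: float) -> float:
    """Right-hand side 2^{-2(log2 J + b)/n} - 2^{-2b/k} for a fixed b,
    with the o(1) term ignored."""
    return 2.0 ** (-2.0 * (log2_J + b) / n) - 2.0 ** (-2.0 * b / k)


def optimal_b(log2_J: float, n: int, k: int) -> float:
    """Stationary point in b: with t = 2^{-2b} and A = 2^{-2 log2 J / n},
    the condition is t^{1/k - 1/n} = A k / n."""
    log2_t = (n * k / (n - k)) * (-2.0 * log2_J / n + math.log2(k / n))
    return -0.5 * log2_t


def optimized_bound(log2_J: float, n: int, k: int) -> float:
    """Closed form J^{-2/(n-k)} * ((n-k)/n) * (k/n)^{k/(n-k)}."""
    return (
        2.0 ** (-2.0 * log2_J / (n - k))
        * ((n - k) / n)
        * (k / n) ** (k / (n - k))
    )


if __name__ == "__main__":
    log2_J, n, k = 30.0, 20, 4  # arbitrary sample values
    b_star = optimal_b(log2_J, n, k)
    print(b_star, bound_given_b(log2_J, n, k, b_star), optimized_bound(log2_J, n, k))
```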



Acknowledgments 

The authors wish to thank Yuzhe Jin and Bhaskar Rao 
for stimulating discussions on their formulation of the sparse 
signal position recovery problem, which motivated the current 
work. 



References 

[1] C. E. Shannon, "Coding theorems for a discrete source with a fidelity 
criterion," in IRE Int. Conv. Rec., part 4, 1959, vol. 7, pp. 142-163, 
reprinted with changes in Information and Decision Processes, R. E. 
Machol, Ed. New York: McGraw-Hill, 1960, pp. 93-126. 

[2] S. Mallat, A Wavelet Tour of Signal Processing: The Sparse Way. 
London: AP Professional, 2009. 

[3] A. D. Wyner, "Random packings and coverings of the unit n-sphere," 
Bell System Tech. J., vol. 46, pp. 2111-2118, 1967. 

[4] W. H. R. Equitz and T. M. Cover, "Successive refinement of informa- 
tion," IEEE Trans. Inf. Theory, vol. IT-37, no. 2, pp. 269-275, 1991. 

[5] E. Tuncel and K. Rose, "Additive successive refinement," IEEE Trans. 
Inf. Theory, vol. IT-49, no. 8, pp. 1983-1991, 2003. 

[6] A. K. Fletcher, S. Rangan, V. K. Goyal, and K. Ramchandran, "De- 
noising by sparse approximation: Error bounds based on rate-distortion 
theory," in EURASIP J. Applied Signal Processing, Special Issue on 
Frames and Overcomplete Representations, vol. 2006, 2006. 

[7] S. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictio- 
naries," IEEE Trans. Signal Processing, vol. 41, no. 12, pp. 3397-3415, 
Dec 1993. 

[8] E. J. Candes and T. Tao, "Near-optimal signal recovery from random 
projections: universal encoding strategies?" IEEE Trans. Inf. Theory, vol. 
IT-52, no. 12, pp. 5406-5425, 2006. 
[9] D. L. Donoho, "Compressed sensing," IEEE Trans. Inf. Theory, vol. 
IT-52, no. 4, pp. 1289-1306, 2006. 

[10] S. Sarvotham, D. Baron, and R. G. Baraniuk, "Measurements vs. bits: 
Compressed sensing meets information theory," in Proceedings of 44th 
Allerton Conf. Comm., Ctrl, Computing, Monticello, IL, 2006. 

[11] S. Aeron, M. Zhao, and V. Saligrama, "Fundamental tradeoffs between 
sparsity, sensing diversity and sensing capacity," in Proceedings of the 
40th Asilomar Conf. Signals, Systems, and Computers, Pacific Grove, 
CA, 2006. 

[12] A. K. Fletcher, S. Rangan, and V. K. Goyal, "On the rate-distortion 
performance of compressed sensing," in Proc. ICASSP, Honolulu, HI, 
2007. 

[13] M. Wainwright, "Information-theoretic bounds on sparsity recovery in 
the high-dimensional and noisy setting," in Proc. IEEE International 
Symposium on Information Theory, Nice, France, June 2007. 

[14] F. Zhang and H. D. Pfister, "Compressed sensing and linear codes over 
real numbers," in Proceedings of the UCSD Information Theory and 
Applications Workshop, La Jolla, CA, 2008. 

[15] W. Dai, H. V. Pham, and O. Milenkovic, "Distortion-rate functions for 
quantized compressive sensing," CoRR, vol. abs/0901.0749, 2009. 

[16] Y. Jin and B. Rao, "Insights into the stable recovery of sparse solutions 
in overcomplete representations using network information theory," in 
Proc. ICASSP, Las Vegas, NV, 2008. 

[17] V. N. Temlyakov, "Nonlinear methods of approximation," Found. 
Comput. Math., vol. 3, no. 1, pp. 33-107, 2003. 

[18] E. Schmidt, "Zur Theorie der linearen und nichtlinearen Integralgle- 
ichungen," Math. Ann., vol. 63, no. 4, pp. 433-476, 1907. 

[19] J. A. Tropp, "Greed is good: algorithmic results for sparse approxima- 
tion," IEEE Trans. Inf. Theory, vol. IT-50, no. 10, pp. 2231-2242, 2004. 

[20] J. A. Tropp, "Just relax: convex programming methods for identifying sparse 
signals in noise," IEEE Trans. Inf. Theory, vol. IT-52, no. 3, pp. 1030- 
1051, 2006. 

[21] D. L. Donoho, M. Elad, and V. N. Temlyakov, "Stable recovery of sparse 
overcomplete representations in the presence of noise," IEEE Trans. Inf. 
Theory, vol. IT-52, no. 1, pp. 6-18, 2006. 

[22] M. Akcakaya and V. Tarokh, "A frame construction and a universal 
distortion bound for sparse representations," IEEE Trans. Signal Pro- 
cessing, vol. 56, no. 6, pp. 2443-2450, June 2008. 

[23] Y.-H. Kim, "Multiple descriptions with codebook reuse," in Proceedings 
of the 42nd Asilomar Conf. Signals, Systems, and Computers, Pacific 
Grove, CA, 2008. 

[24] T. M. Cover, "Some advances in broadcast channels," in Advances in 
Communication Systems, A. J. Viterbi, Ed. San Francisco: Academic 
Press, 1975, vol. 4, pp. 229-260. 

[25] A. D. Wyner, "Recent results in the Shannon theory," IEEE Trans. Inf. 
Theory, vol. IT-20, pp. 2-10, 1974. 

[26] Y. Jin and B. Rao, "Performance limits of matching pursuit algorithms," 
in Proc. IEEE International Symposium on Information Theory, Toronto, 
ON, July 2008. 



